Search results for "Complete-linkage clustering"

showing 4 items of 4 documents

Structural clustering of millions of molecular graphs

2014

We propose an algorithm for clustering very large molecular graph databases according to scaffolds (i.e., large structural overlaps) that are common between cluster members. Our approach first partitions the original dataset into several smaller datasets using a greedy clustering approach named APreClus based on dynamic seed clustering. APreClus is an online and instance incremental clustering algorithm delaying the final cluster assignment of an instance until one of the so-called pending clusters the instance belongs to has reached significant size and is converted to a fixed cluster. Once a cluster is fixed, APreClus recalculates the cluster centers, which are used as representatives for…

Clustering high-dimensional dataFuzzy clusteringTheoretical computer sciencek-medoidsComputer scienceSingle-linkage clusteringCorrelation clusteringConstrained clusteringcomputer.software_genreComplete-linkage clusteringGraphHierarchical clusteringComputingMethodologies_PATTERNRECOGNITIONData stream clusteringCURE data clustering algorithmCanopy clustering algorithmFLAME clusteringAffinity propagationData miningCluster analysiscomputerk-medians clusteringClustering coefficientProceedings of the 29th Annual ACM Symposium on Applied Computing
researchProduct

A Greedy Algorithm for Hierarchical Complete Linkage Clustering

2014

We are interested in the greedy method to compute an hierarchical complete linkage clustering. There are two known methods for this problem, one having a running time of \({\mathcal O}(n^3)\) with a space requirement of \({\mathcal O}(n)\) and one having a running time of \({\mathcal O}(n^2 \log n)\) with a space requirement of Θ(n 2), where n is the number of points to be clustered. Both methods are not capable to handle large point sets. In this paper, we give an algorithm with a space requirement of \({\mathcal O}(n)\) which is able to cluster one million points in a day on current commodity hardware.

CombinatoricsCURE data clustering algorithmSUBCLUNearest-neighbor chain algorithmCorrelation clusteringSingle-linkage clusteringHierarchical clustering of networksGreedy algorithmComplete-linkage clusteringMathematics
researchProduct

An efficient prototype merging strategy for the condensed 1-NN rule through class-conditional hierarchical clustering

2002

Abstract A generalized prototype-based classification scheme founded on hierarchical clustering is proposed. The basic idea is to obtain a condensed 1-NN classification rule by merging the two same-class nearest clusters, provided that the set of cluster representatives correctly classifies all the original points. Apart from the quality of the obtained sets and its flexibility which comes from the fact that different intercluster measures and criteria can be used, the proposed scheme includes a very efficient four-stage procedure which conveniently exploits geometric cluster properties to decide about each possible merge. Empirical results demonstrate the merits of the proposed algorithm t…

Single-linkage clusteringcomputer.software_genreComplete-linkage clusteringHierarchical clusteringk-nearest neighbors algorithmArtificial IntelligenceNearest-neighbor chain algorithmClassification ruleSignal ProcessingCluster (physics)Computer Vision and Pattern RecognitionData miningMerge (version control)computerSoftwareMathematicsPattern Recognition
researchProduct

SparseHC: A Memory-efficient Online Hierarchical Clustering Algorithm

2014

Computing a hierarchical clustering of objects from a pairwise distance matrix is an important algorithmic kernel in computational science. Since the storage of this matrix requires quadratic space with respect to the number of objects, the design of memory-efficient approaches is of high importance to this research area. In this paper, we address this problem by presenting a memory-efficient online hierarchical clustering algorithm called SparseHC. SparseHC scans a sorted and possibly sparse distance matrix chunk-by-chunk. Meanwhile, a dendrogram is built by merging cluster pairs as and when the distance between them is determined to be the smallest among all remaining cluster pairs. The k…

sparse matrixClustering high-dimensional dataTheoretical computer scienceonline algorithmsComputer scienceSingle-linkage clusteringComplete-linkage clusteringNearest-neighbor chain algorithmConsensus clusteringmemory-efficient clusteringCluster analysisk-medians clusteringGeneral Environmental ScienceSparse matrix:Engineering::Computer science and engineering [DRNTU]k-medoidsDendrogramConstrained clusteringHierarchical clusteringDistance matrixCanopy clustering algorithmGeneral Earth and Planetary SciencesFLAME clusteringHierarchical clustering of networkshierarchical clusteringAlgorithmProcedia Computer Science
researchProduct